Defining a canonical unit for accounting purposes
Compute resource providers often put in place batch compute systems to
maximize the utilization of such resources. However, compute nodes in such
clusters, both physical and logical, contain several complementary resources,
with notable examples being CPUs, GPUs, memory and ephemeral storage. User jobs
will typically require more than one such resource, resulting in co-scheduling
trade-offs of partial nodes, especially in multi-user environments. When
accounting for either user billing or scheduling overhead, it is thus important
to consider all such resources together. We therefore define the concept of a
threshold-based "canonical unit" that combines several resource types into a
single discrete unit, and we use it to characterize scheduling overhead and make
resource billing fairer for both resource providers and users. Note that the
exact definition of a canonical unit is not prescribed and may change between
resource providers. Nevertheless, we provide a template and two example
definitions that we consider appropriate in the context of the Open Science
Grid.
Comment: 6 pages, 2 figures, To be published in proceedings of PEARC2
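The threshold-based billing idea above can be sketched in a few lines: a job is charged by whichever resource it consumes most of, rounded up to whole units. This is a minimal illustration only; the unit definition below (1 CPU core + 4 GB memory per unit) is a hypothetical example, since the abstract stresses that the exact definition is provider-specific.

```python
from math import ceil

# Hypothetical canonical-unit definition: 1 unit = 1 CPU core + 4 GB memory.
# Real providers would pick thresholds matching their node shapes.
UNIT_DEF = {"cpus": 1, "mem_gb": 4}

def canonical_units(request, unit_def=UNIT_DEF):
    """Charge a job by its most-demanding resource, rounded up to whole units."""
    return max(ceil(request[r] / unit_def[r]) for r in unit_def if r in request)

# A job asking for 2 CPUs and 16 GB is memory-dominated: 16 / 4 = 4 units,
# even though it only uses 2 CPU-units. Billing the max keeps partially
# blocked node resources from going unaccounted for.
print(canonical_units({"cpus": 2, "mem_gb": 16}))  # -> 4
```

Taking the maximum (rather than the sum) reflects that a memory-heavy job effectively blocks the co-scheduling of the CPUs it leaves idle.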
Data Access for LIGO on the OSG
During 2015 and 2016, the Laser Interferometer Gravitational-Wave Observatory
(LIGO) conducted a three-month observing campaign. These observations delivered
the first direct detection of gravitational waves from binary black hole
mergers. To search for these signals, the LIGO Scientific Collaboration uses
the PyCBC search pipeline. To deliver science results in a timely manner, LIGO
collaborated with the Open Science Grid (OSG) to distribute the required
computation across a series of dedicated, opportunistic, and allocated
resources. To deliver the petabytes necessary for such a large-scale
computation, our team deployed a distributed data access infrastructure based
on the XRootD server suite and the CernVM File System (CVMFS). This data access
strategy grew from simply accessing remote storage to a POSIX-based interface
underpinned by distributed, secure caches across the OSG.
Comment: 6 pages, 3 figures, submitted to PEARC1
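The POSIX-over-CVMFS strategy described above can be sketched as a simple resolution rule: prefer the local CVMFS mount when it is present, otherwise fall back to a remote XRootD URL. The mount point and redirector names below are placeholders for illustration, not the actual endpoints used by the LIGO deployment.

```python
import os

def resolve_data_path(lfn,
                      cvmfs_root="/cvmfs/ligo.osgstorage.org",      # hypothetical mount
                      redirector="root://redirector.example.org:1094"):  # placeholder
    """Map a logical file name to a POSIX path if the CVMFS repository is
    mounted on this worker node, else to a remote XRootD URL."""
    local = os.path.join(cvmfs_root, lfn.lstrip("/"))
    if os.path.exists(local):
        return local                      # cached, POSIX-readable copy
    return redirector + "/" + lfn.lstrip("/")  # stream from remote storage
```

In this sketch the job code only ever sees a path-like string, which is what lets an analysis pipeline written against POSIX I/O run unchanged on nodes with or without the cache.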
Testing GitHub projects on custom resources using unprivileged Kubernetes runners
GitHub is a popular repository for hosting software projects, both due to
ease of use and the seamless integration with its testing environment. Native
GitHub Actions make it easy for software developers to validate new commits and
have confidence that new code does not introduce major bugs. The freely
available test environments are limited to only a few popular setups but can be
extended with custom Action Runners. Our team had access to a Kubernetes
cluster with GPU accelerators, so we explored the feasibility of automatically
deploying GPU-providing runners there. All available Kubernetes-based setups,
however, require cluster-admin level privileges. To address this problem, we
developed a simple custom setup that operates in a completely unprivileged
manner. In this paper we provide a summary description of the setup and our
experience using it in the context of two Knight lab projects on the Prototype
National Research Platform system.
Comment: 5 pages, 1 figure, To be published in proceedings of PEARC2
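From the workflow author's side, using such a custom runner only requires targeting its labels. The excerpt below is a hypothetical workflow fragment: the `gpu` label and the `nvidia-smi` check are illustrative choices, not ones prescribed by the paper, but `runs-on: [self-hosted, ...]` is the standard GitHub Actions mechanism for routing a job to a custom runner.

```yaml
name: gpu-tests
on: [push, pull_request]
jobs:
  test-on-gpu:
    # Labels must match those registered by the in-cluster runner;
    # "gpu" here is an assumed, illustrative label.
    runs-on: [self-hosted, gpu]
    steps:
      - uses: actions/checkout@v4
      - run: nvidia-smi   # sanity check that the runner exposes a GPU
```

Because the runner itself registers outbound to GitHub, no inbound access to the Kubernetes cluster is needed, which is what makes a fully unprivileged deployment plausible.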